510 Software Group

2014-03-01 SPF - is it useful

Is SPF useful, or is it snake oil?

Overview
Large vs small senders
Small senders vs small spammers
SPF based whitelisting
SPF based blacklisting
Parsing SPF records
Summary

Overview

The only reason we design, code and use things like spf, dkim and dns based blacklists or whitelists of ip addresses or domains is to reject spam. Any spam control element needs to help with at least one of two measures. Either that element directly helps reject spam, or it helps to whitelist wanted mail (ham). Spam control elements that help with whitelisting indirectly help reject spam, since those whitelisting elements then allow the use of more draconian filtering elements to reject spam.

If there were no spam email, we would not need any of these, and the job of a mail server would be much simpler - just accept all incoming mail and put it in the proper mailbox.

But the world is not that simple, and we do have a flood of spam that needs to be rejected. So where can spf help mail systems reduce the spam delivered to user mailboxes? Consider an spf record for example.com like "v=spf1 ip4:10.1.1.0/24 ~all". We can interpret that as meaning that much of the mail from example.com will originate from 10.1.1.0/24, but that some may be forwarded and so appear to originate from other addresses.

By itself, this tells us nothing about the spam/ham properties of mail from example.com, or from that /24 block.
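To make the structure of such a record concrete, here is a minimal Python sketch that splits an spf record into its terms. The function name parse_spf is hypothetical, and the sketch is a simplification: it skips modifiers such as redirect= and does not expand macros, so it is not a complete spf parser.

```python
def parse_spf(record):
    """Split an spf TXT record into (qualifier, mechanism, value) tuples.

    Simplified sketch: modifiers (anything containing "=") are skipped
    and macros are not expanded.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an spf version 1 record")
    parsed = []
    for term in terms[1:]:
        if "=" in term:
            # Modifier such as redirect=...; ignored in this sketch.
            continue
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in "+-~?":
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        if "/" in name:
            # Forms like "a/24" carry a prefix length but no explicit name.
            name, _, cidr = name.partition("/")
            value = "/" + cidr
        parsed.append((qualifier, name.lower(), value))
    return parsed


example = parse_spf("v=spf1 ip4:10.1.1.0/24 ~all")
```

For the example record, the result holds one ip4 term with the default "+" qualifier and one all term with the "~" (softfail) qualifier. Nothing in that output says whether the mail is spam or ham.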

Large vs small senders

Consider three groups of mail senders. In the first group we have the top 50 senders by volume (google, yahoo, hotmail, etc). In the second group we have the next 500 senders by volume. And in the third group we have the next 500000 senders by volume.

Anyone can manually come up with their own list of group one senders, and can use the spf records from those domains to whitelist, or at least avoid blacklisting, the main mail servers of those group one domains. In that sense, spf is useful, since it helps the group one domains publish a list of their outbound mail servers, in a standard format that is easily parsed.

Note that this usage of spf data is not directly reducing spam in user inboxes. Instead, it is a method to reduce the false positives that may otherwise be generated by draconian spam filtering using other methods.

But no one is going to manually maintain a list of the domains in group three, because there are too many of them. And you cannot just whitelist any domain that happens to publish an spf record, since spammers can (and do) publish those.

Small senders vs small spammers

So how do you tell the difference (using spf) between a small sender like abqpubco.com, and a presumed spammer like uggfrom.com? As of 2013-12-30, those domains have the following dns records:

abqpubco.com. 172003 IN TXT "v=spf1 ip4:216.85.20.0/24 a mx a:web2.abqjournal.com include:eblastengine.com ~all"

uggfrom.com. 600 IN TXT "v=spf1 ip4:23.225.179.0/24 -all"

23.225.179.1 179ppr1.merchantd.com.
23.225.179.2 179ppr2.merchantd.com.
23.225.179.3 179ppr3.merchantd.com.
...
23.225.179.253 179ppr253.merchantd.com.
23.225.179.254 179ppr254.merchantd.com.
23.225.179.255 179ppr255.merchantd.com.

merchantd.com.dbl.spamhaus.org has address 127.0.1.2
uggfrom.com.multi.surbl.org has address 127.0.0.66
uggfrom.com.dbl.spamhaus.org has address 127.0.1.2

I don't think there is any reasonable way to determine the difference (with respect to spam/ham expectations) between abqpubco.com and uggfrom.com from their spf records alone. However, if you have some other information, such as a whitelisting request from your customer for abqpubco.com, or a spam sample or the existence of a DBL listing for uggfrom.com - that can be combined with the spf data to improve your filtering.

SPF based whitelisting

Suppose you have a request from your customer to whitelist mail from abqpubco.com. You could whitelist all mail with envelope from *@abqpubco.com, but if spammers use that domain, you might end up whitelisting a lot of spam. You could whitelist all mail from their current outbound mail servers, but if those ip addresses are shared with other users, you might end up whitelisting a lot of spam. Those addresses will probably also change, so this is a maintenance problem for you. You could use their spf data to whitelist mail from that domain that is also arriving from their ip addresses as specified in their spf record. That will cut down on the chances of whitelisting a bunch of spam, and it pushes the maintenance problem off on the owners of abqpubco.com.

For the purpose of such spf based whitelisting, you could ignore the weak ~all mechanism on the abqpubco.com txt spf record. That would result in whitelisting mail from abqpubco.com, but only from the ip addresses that are explicitly listed in their txt spf record.
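The whitelisting rule described here can be sketched in Python as follows. This is only an illustration under stated assumptions: the names spf_listed_networks and is_whitelisted and the whitelist dictionary are hypothetical, the spf record is assumed to have been fetched already, and only ip4 mechanisms are checked. Handling a, mx and include would need dns lookups that this sketch leaves out.

```python
import ipaddress


def spf_listed_networks(record):
    """Return the networks named by passing ip4 mechanisms in an spf record."""
    networks = []
    for term in record.split()[1:]:
        if term.startswith("+"):
            term = term[1:]
        # Terms with -, ~ or ? qualifiers (including ~all) are ignored.
        if not term.lower().startswith("ip4:"):
            continue
        try:
            networks.append(ipaddress.ip_network(term[4:], strict=False))
        except ValueError:
            continue  # badly formatted ip4 value, skip it
    return networks


def is_whitelisted(envelope_from, client_ip, whitelist):
    """True only if the domain was whitelisted AND the ip is in its spf data.

    whitelist maps a domain name to its spf record text (hypothetical
    structure, filled in from customer requests).
    """
    domain = envelope_from.rpartition("@")[2].lower()
    record = whitelist.get(domain)
    if record is None:
        return False
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in spf_listed_networks(record))


whitelist = {
    "abqpubco.com": "v=spf1 ip4:216.85.20.0/24 a mx a:web2.abqjournal.com "
                    "include:eblastengine.com ~all",
}
listed = is_whitelisted("news@abqpubco.com", "216.85.20.17", whitelist)
forged = is_whitelisted("news@abqpubco.com", "203.0.113.5", whitelist)
```

Here listed is true and forged is false: mail claiming to be from abqpubco.com but arriving from an unlisted address simply gets no whitelisting and falls through to the normal filters.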

Again, this usage of spf data is not directly reducing spam in user inboxes, but is instead reducing the false positives generated by other spam control methods.

SPF based blacklisting

Suppose you have a spam sample with an envelope from in the domain uggfrom.com. You may have rejected that message based on the DBL or SURBL listing of that domain, and you now wish to reject all mail from their mail servers. You could use their spf record above to extend your blacklisting to include 23.225.179.0/24. Although the spammers are publishing spf data in the hopes of getting reduced spam filtering, we can use that same data to block them. This process does need manual oversight, since spammers could easily include third party ip addresses in their spf record, hoping that you would then block mail from that third party.
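One way to keep a human in the loop is to generate blacklist candidates from the spf record and flag any that overlap address space you already trust. The Python sketch below assumes a hypothetical known_good list of networks and a hypothetical function name blacklist_candidates. It reads only ip4 mechanisms from a record that has already been fetched.

```python
import ipaddress


def blacklist_candidates(record, known_good):
    """Return (network, needs_review) pairs for ip4 terms in an spf record.

    needs_review is True when the network overlaps any network in
    known_good, which may mean the spammer listed third party addresses.
    """
    candidates = []
    for term in record.split()[1:]:
        if term.startswith("+"):
            term = term[1:]
        if not term.lower().startswith("ip4:"):
            continue
        try:
            net = ipaddress.ip_network(term[4:], strict=False)
        except ValueError:
            continue  # badly formatted ip4 value, skip it
        needs_review = any(net.overlaps(good) for good in known_good)
        candidates.append((net, needs_review))
    return candidates


# Hypothetical trusted space: the block from a whitelisted sender.
known_good = [ipaddress.ip_network("216.85.20.0/24")]
found = blacklist_candidates("v=spf1 ip4:23.225.179.0/24 -all", known_good)
```

A candidate that is not flagged still deserves a look before it goes into production, since the known_good list can never cover every third party a spammer might name.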

Parsing SPF records

Of course there are many folks that create badly formatted spf and other dns records. Some of the ones that I have seen include:

The long time favorite, MX records pointing to CNAMEs
Names in the ip4 mechanism, "ip4:some.name.tld"
Recursive loops via the include mechanism
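Two of these problems can be detected without any dns traffic once the records are in hand. The Python sketch below is a rough illustration: bad_ip4_terms and has_include_loop are hypothetical names, and the records dictionary stands in for spf data that has already been looked up. Checking for MX records that point to CNAMEs needs live dns queries and is not covered.

```python
import ipaddress


def bad_ip4_terms(record):
    """Return ip4 terms whose value is not an IPv4 address or network."""
    bad = []
    for term in record.split()[1:]:
        body = term.lstrip("+-~?")
        if body.lower().startswith("ip4:"):
            try:
                ipaddress.IPv4Network(body[4:], strict=False)
            except ValueError:
                bad.append(term)
    return bad


def has_include_loop(domain, records, path=()):
    """True if following include: terms from domain ever revisits a domain.

    records maps domain names to spf record text; path holds the domains
    on the current chain of includes.
    """
    if domain in path:
        return True
    record = records.get(domain, "")
    for term in record.split()[1:]:
        body = term.lstrip("+-~?")
        if body.lower().startswith("include:"):
            target = body[8:].lower()
            if has_include_loop(target, records, path + (domain,)):
                return True
    return False


# Made-up records for illustration only.
records = {
    "a.example": "v=spf1 include:b.example ~all",
    "b.example": "v=spf1 include:a.example ~all",
    "c.example": "v=spf1 ip4:some.name.tld ip4:10.1.1.0/24 ~all",
}
loop = has_include_loop("a.example", records)
bad = bad_ip4_terms(records["c.example"])
```

Tracking only the current chain of includes, rather than every domain seen so far, means two records that both include the same third domain are not mistaken for a loop.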

Summary

Is there any compelling reason for a domain in group three to publish spf records? I think the answer depends on whether anyone needs to specifically whitelist your mail, and on whether your mail originates from a large pool like godaddy or google, or whether you have your own ip address space and your own mail server.

If mail for your small domain is handled by godaddy / google / outlook.com or equivalent, then folks won't really be able to whitelist your mail by ip address, or reverse dns name. In this case, publishing an spf record allows them to whitelist the combination of (envelope from, ip address).

If you have your own ip address space, and run your own mail server, it should be just as easy for them to whitelist you by reverse dns name.

You should not feel lonely if you don't publish spf records. Many other folks have made that same decision, including these:

yahoo.com
aaa.com
aero.org
cbsnews.com
free.fr
harvard.edu
med.cornell.edu
reuters.com
theguardian.com
uh.edu

Of course, some of those domains may send no email.